(57) ABSTRACT

An apparatus comprising a plurality of interface circuits, a 
plurality of transmit outputs and a plurality of receive inputs. 
The plurality of interface circuits each comprises (i) a 
transmit circuit and (ii) a receive circuit. One of the plurality 
of transmit outputs is generally connected to one of the 
plurality of receive circuits. One of the plurality of receive 
inputs is generally connected to one of the plurality of 
transmit circuits. In general, each one of the plurality of
transmit outputs is connected to one of the plurality of
receive inputs.
HIGHLY SCALABLE ARCHITECTURE FOR IMPLEMENTING SWITCH FABRICS
WITH QUALITY OF SERVICES

CROSS-REFERENCE TO RELATED APPLICATIONS

This application may relate to co-pending U.S. application
Ser. No. 09/347,046, filed Jul. 2, 1999; and U.S. application
Ser. No. 09/347,045, filed Jul. 2, 1999, which are each hereby
incorporated by reference in their entirety.

FIELD OF THE INVENTION

The present invention relates to communication switching
devices generally and, more particularly, to a highly scalable
architecture for implementing switch fabrics with quality of
services.

BACKGROUND OF THE INVENTION

Referring to FIG. 1, a block diagram of a circuit 10 is shown
implementing a conventional crossbar switch fabric. A number of
ports 12a-12n are shown connected to a switch fabric 14. The
port 12a is shown comprising a serializer/deserializer block 16,
a storage buffer 18, a scheduler 20, a packet classifier 22, a
queue manager 24, a packet classifier 26, a queue manager 28 and
a storage buffer 30. Each of the ports 12a-12n has similar
components. A parallel bus 32 transmits data from the port 12a
to the switch fabric 14. Similarly, a parallel bus 34 receives
data from the switch fabric 14. A serial link 36 receives data
from a line card (not shown) and a serial link 38 transmits data
to the line card.

For the transmit side, the data arrives from the line card
through the serial link 36. The data is deserialized into
parallel data by the serializer/deserializer circuit 16 and then
presented to the packet classifier 22. The packet classifier 22
looks at the information embedded within the packet data and
determines the appropriate outgoing port 12a-12n that will
receive the packet data. The packet classifier 22 may also
determine the priority of the packet data from the embedded
information. The queue manager 24 informs the scheduler 20 about
the new packet arrival. The packet is stored in the storage
buffer 18 until the packet is scheduled to go to the appropriate
port 12a-12n through the switch fabric 14. The scheduler 20 of
each port 12a-12n communicates with the port schedulers of the
other ports 12a-12n and, based on a predetermined algorithm,
schedules packets from all the incoming ports 12a-12n to the
outgoing ports 12a-12n through the switch fabric 14.

The packet classifier 22 and the queue manager 28 are normally
implemented in an application specific integrated circuit (ASIC)
or a field programmable gate array (FPGA). Similarly, the
scheduler 20 is normally implemented in an ASIC or an FPGA. The
storage buffers 18 and 30 are normally implemented using dual
port memories. The switch fabric 14 is a large pin count cross
bar chip or is constructed using PLDs to implement a multiplexer
function with control signals. The receive side has a similar
operation provided by the packet classifier 26, the queue
manager 28 and the storage buffer 30. However, the receive side
only has to process priority information and not port
information.

The performance of the circuit 10 is limited by the speed and
width of the circuit 10. To increase operating speed to a higher
bandwidth requires either higher interface speed or an increased
bus width of the switch fabric 14. Additionally, this
configuration requires a switch fabric chip 14 to connect ports
for switching.

SUMMARY OF THE INVENTION

The present invention concerns an apparatus comprising a
plurality of interface circuits, a plurality of transmit outputs
and a plurality of receive inputs. The plurality of interface
circuits each comprises (i) a transmit circuit and (ii) a
receive circuit. One of the plurality of transmit outputs is
generally connected to one of the plurality of receive circuits.
One of the plurality of receive inputs is generally connected to
one of the plurality of transmit circuits. In general, each one
of the plurality of the transmit outputs is generally connected
to one of the plurality of the receive inputs.

The objects, features and advantages of the present invention
include providing a communication interface that may (i)
eliminate parallel interfaces from the system allowing a more
scalable solution, (ii) not require a separate switch fabric
chip, (iii) be created by connecting the individual elements
together, (iv) reduce the number of routes on the board which
may reduce the board cost, (v) reduce the chip count for the
system and (vi) reduce power.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other objects, features and advantages of the present
invention will be apparent from the following detailed
description and the appended claims and drawings in which:

FIG. 1 is a block diagram of a conventional communication
switching device;

FIG. 2 is a diagram of a preferred embodiment of the present
invention; and

FIG. 3 is a diagram of an implementation of the preferred
embodiment in the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to FIG. 2, a block diagram of a circuit 100 is shown
in accordance with a preferred embodiment of the present
invention. The circuit 100 generally comprises a receive block
(or circuit) 102 and a transmit block (or circuit) 104. The
receive circuit 102 may be implemented as a receive switch
fabric element. The transmit circuit 104 may be implemented as a
transmit switch fabric element.

The receive switch fabric element 102 generally comprises a
transmit circuit 110, a multi-queue storage circuit 112, a queue
classifier circuit 114, a receive circuit 116 and a selectable
multiplexer 118. The transmit circuit 110, the multi-queue
storage circuit 112 and the queue classifier 114 may be
implemented, in one example, as a single chip 115. Similarly,
the transmit switch fabric element 104 generally comprises a
receive circuit 120, a queue classifier circuit 122, a
multi-queue storage circuit 124, a transmit circuit 126 and a
selectable multiplexer 128. The receive circuit 120, the queue
classifier 122 and the multi-queue storage element 124 may be,
in one example, implemented as a single chip 125. In another
example, two or more of the transmit circuit 110, the
multi-queue storage circuit 112, the queue classifier circuit
114, the receive circuit 116 and the selectable multiplexer 118
may be implemented as a single integrated circuit. Similarly,
two or more of the receive circuit 120, the queue classifier
circuit 122, the multi-queue storage circuit 124, the transmit
circuit 126 and the selectable multiplexer 128 may be
implemented as a single integrated circuit. In yet another
example, two or more of the transmit circuit 110, the
multi-queue storage circuit 112, the queue classifier circuit
114, a receive circuit 116, the selectable multiplexer 118, the
receive circuit 120, the queue classifier circuit 122, the
multi-queue storage circuit 124, the transmit circuit 126 and
the selectable multiplexer 128 may be implemented as a single
integrated circuit. Additionally, various sub-combinations of
the transmit circuit 110, the multi-queue storage circuit 112,
the queue classifier circuit 114, a receive circuit 116, the
selectable multiplexer 118, the receive circuit 120, the queue
classifier circuit 122, the multi-queue storage circuit 124, the
transmit circuit 126 and the selectable multiplexer 128 may be
implemented as two or more integrated circuits.

In the transmit switch fabric element 104, data is generally
received from a line card (not shown) through a serial link 130
and converted into parallel data. The parallel data may then be
presented to the queue classifier 122 which may determine the
outgoing port information (and/or priority information) from
embedded information in the data. The port information may then
be presented to the multi-queue storage device 124. The
multi-queue storage device 124 may be implemented as a queue
manager and a storage buffer combined in one circuit. An example
of the multi-queue storage device 124 may be found in co-pending
application Ser. No. 09/347,046, filed Jul. 2, 1999, which is
hereby incorporated by reference in its entirety. A queue
manager portion may be constructed to support different queues
for each output and for each priority. The ability to provide
multiple priorities for each output may enable the multi-queue
storage device 124 to provide quality of service (QoS). A
scheduler portion (to be described in more detail in connection
with FIG. 3) may provide the information about the outgoing port
to the multi-queue portion 124 and to the selectable multiplexer
circuit 128. Similarly, the scheduler may provide the
information about the incoming port to the multi-queue portion
112 and to the selectable multiplexer circuit 118. The
information about the outgoing and/or incoming port may be
communicated to the multi-queue portion 124 (or 112) and the
selectable multiplexer circuit 128 (or 118) through one or more
interfaces. The data is then sent to the outgoing port through
one or more outputs 132a-132n.

In the receive switch fabric element 102, the scheduler selects
an input 134a-134n from which data is to be recovered. The data
may be presented to the multi-queue storage element 112 to store
the data with different levels of priority for supporting
quality of service. The data may then be transmitted to the line
card through a serial link 136.

FIG. 3 illustrates how a number of receive switch fabric
elements 102 and a number of transmit switch fabric elements 104
may be combined in a number of interface circuits 100a-100n. A
scheduler 150 may be implemented in each of the interface
circuits 100a-100n. The scheduler 150 may be configured to
control the priority and port direction of the transmit switch
fabric element 104 and the receive switch fabric element 102.

The interface circuits 100a-100n may be connected together
through a number of links 160a-160n to function as a switch
fabric. In general, each of the interface circuits 100a-100n is
directly connected to each of the other interface circuits
100a-100n. The switch fabric function is implemented as only a
number of routes (e.g., connections or links) on the board. In
one example, the links 160a-160n may be implemented as one or
more high speed serial links.

The transmit switch fabric element 104 and the receive switch
fabric element 102 may be implemented as a chip set for the port
or may be integrated into a single chip if the technology
permits. In this case the parallel interface of FIG. 1 is not
exposed at all. Thus, the limitations associated with the
parallel interface of FIG. 1 may be eliminated. The circuit 100
may reduce the number of routes significantly as compared to the
parallel interface of FIG. 1 because of the elimination of the
parallel connections from one chip (e.g., the port 12a) to
another (e.g., the port 12b). The example shown in the following
TABLES 1 and 2 illustrates calculations for crossbar switch
fabric and mesh switch fabric for 2.5 Gbps serial link for 4, 8,
16 and 32 port configurations.

TABLE 1

Connections              4 ports      8 ports   16 ports   32 ports
Serial Link              16 (4*2*2)   32        64         128
Serial link -> PC/QM     200 (4*50)   400       800        1600
PC/QM -> Storage Buf     200          400       800        1600
Storage Buf -> SF        200          400       800        1600
Total                    616          1232      2464       4928
Bandwidth                10G          20G       40G        80G

TABLE 2

Connections              4 ports      8 ports   16 ports   32 ports
Serial Link              16           32        64         128
Switch Fabric            24           112       480        1984
Total                    40           144       544        2112
Bandwidth                10G          20G       40G        80G

The parallel bus speed used in the example is 100 MHz. For 2.5
Gbps bandwidth, a 25 pin wide bus would be required using the
circuit of FIG. 1. In the present invention, each serial
connection uses two routes. The transmit and receive generally
doubles the number of routes. For example, in the case of the
circuit of FIG. 1, the bus width for RX/TX would be 25 each.

TABLE 1 shows the total number of routes for the old method. The
first row and first column show how the calculations were
derived. A pair of serial links for RX/TX for 4 ports results in
16 connections. TABLE 2 shows route/connection calculations for
the present invention. The comparison of TABLE 1 and TABLE 2
shows that, for the same bandwidth, the number of
connections/routes required is significantly less when
implemented with the present invention. The chip count for the
present invention will also be significantly less than the chip
count of the circuit of FIG. 1. For example, the circuit of FIG.
1 requires 2 chips for the packet classifier (PC)/queue manager
(QM) 22/24 and 28/26, two dual port memories (i.e., 18, 30), one
serializer/deserializer 16, one scheduler 20 and one PLD to
implement switch fabric (multiplexer) 14 for each port 12a-12n.
This implies seven chips per port or, for a 4, 8, 16 and 32 port
switch fabric, 28, 56, 112 and 224 chips, respectively.

However, even if the transmit switch fabric element 104 and the
receive switch fabric element 102 are implemented as separate
chips, the present invention would require three chips per port,
including the scheduler. For a 4, 8, 16 and 32 port switch
fabric, the present invention would require 12, 24, 48 and 96
chips, which is significantly smaller than the old method. When
the transmit switch fabric element 104 and the receive switch
fabric element 102 are integrated into a single chip, the chip
count will further drop to 8, 16, 32 and 64, respectively. A
lower chip count and smaller number of outputs toggling will
also result in power reduction of the system.

While the invention has been particularly shown and described
with reference to the preferred embodiments thereof, it will be
understood by those skilled in the art that various changes in
form and details may be made without departing from the spirit
and scope of the invention.
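
The route and chip counts discussed in connection with TABLES 1 and 2 can be reproduced with a short calculation. The following Python sketch is illustrative only and is not part of the disclosure. The function and constant names are hypothetical, and the n*(n-1) mesh term is inferred from the tabulated Switch Fabric values rather than stated in the text.

```python
"""Route and chip counts for the crossbar (FIG. 1) and mesh (FIG. 3) examples."""

SERIAL_GBPS = 2.5        # per-port serial link rate, from the text
PARALLEL_BUS_MHZ = 100   # parallel bus speed, from the text
ROUTES_PER_SERIAL = 2    # each serial connection uses two routes

# 2.5 Gbps carried on a 100 MHz bus needs a 25 pin wide bus per direction.
BUS_WIDTH = round(SERIAL_GBPS * 1000 / PARALLEL_BUS_MHZ)


def line_card_routes(ports):
    """One RX and one TX serial link per port, two routes each."""
    return ports * 2 * ROUTES_PER_SERIAL


def crossbar_routes(ports):
    """Total routes for the conventional circuit (TABLE 1).

    Three parallel hops per port (serial link -> PC/QM -> storage
    buffer -> switch fabric), each with an RX bus and a TX bus.
    """
    parallel_hop = ports * 2 * BUS_WIDTH
    return line_card_routes(ports) + 3 * parallel_hop


def mesh_routes(ports):
    """Total routes for the mesh arrangement (TABLE 2).

    Assumption: every interface circuit has one serial link to each
    of the other interface circuits, giving ports * (ports - 1) links.
    """
    fabric = ports * (ports - 1) * ROUTES_PER_SERIAL
    return line_card_routes(ports) + fabric


def chip_count(ports, chips_per_port):
    """Chips for a fabric: 7 per port (FIG. 1), 3 (separate), 2 (integrated)."""
    return ports * chips_per_port


if __name__ == "__main__":
    for n in (4, 8, 16, 32):
        print(n, crossbar_routes(n), mesh_routes(n),
              chip_count(n, 7), chip_count(n, 3), chip_count(n, 2))
```

For the 4 port case this gives 616 routes for the crossbar arrangement and 40 routes for the mesh arrangement, matching the Total rows of the two tables.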
What is claimed is: 

1. An apparatus comprising: 

a plurality of interface circuits each comprising (i) a 
transmit circuit comprising a plurality of transmit 
outputs, (ii) a receive circuit comprising a plurality of 
receive inputs and (iii) a scheduler circuit configured to 
control said transmit and receive circuits, wherein each 
one of said plurality of transmit outputs of one of said 
plurality of interface circuits is connected to one of said 
plurality of receive inputs of another of said plurality of 
interface circuits. 

2. The apparatus according to claim 1, wherein said 
transmit outputs and said receive inputs are connected 
through a plurality of serial links. 

3. The apparatus according to claim 1, wherein each of 
said transmit circuits comprises: 

a receive element; 

a queue classifier couplable to said receive element; 
a storage element couplable to said queue classifier; 
a transmit element couplable to said storage element; and 
a selectable multiplexer configured to couple said transmit 
element to said plurality of transmit outputs. 

4. The apparatus according to claim 3, wherein two or 
more of said receive element, said queue classifier, said 
storage element, said transmit element and said selectable 
multiplexer are integrated as a single integrated circuit. 

5. The apparatus according to claim 1, wherein each of 
said receive circuits comprises: 

a receive element; 

a queue classifier couplable to said receive element; 
a storage element couplable to said queue classifier; 
a transmit element couplable to said storage element; and 
a selectable multiplexer configured to couple said plural- 
ity of receive inputs to said receive element. 

6. The apparatus according to claim 5, wherein two or 
more of said receive element, said queue classifier, said 
storage element, said transmit element and said selectable 
multiplexer are integrated as a single integrated circuit. 

7. The apparatus according to claim 1, wherein: 

each of said transmit circuits comprises: (i) a first receive 
element, (ii) a first queue classifier, (iii) a first storage 
element, (iv) a first transmit element, and (v) a first 
selectable multiplexer; and 

each of said receive elements comprises (i) a second 
receive element, (ii) a second queue classifier, (iii) a 
second storage element, (iv) a second transmit element 
and (v) a second selectable multiplexer. 

8. The apparatus according to claim 7, wherein two or 
more of said first receive element, said first queue classifier, 
said first storage element, said first transmit element, said 
first selectable multiplexer, said second receive element, 
said second queue classifier, said second storage element, 
said second transmit element and said second selectable 
multiplexer are integrated as a single integrated circuit. 

9. The apparatus according to claim 1, wherein said 
scheduler circuit is configured to control priority and port 
direction of said transmit and receive circuits. 

10. The apparatus according to claim 3, wherein two or 
more of said receive element, said queue classifier, said 
storage element, said transmit element and said selectable 
multiplexer are implemented as separate circuits. 

11. The apparatus according to claim 5, wherein two or 
more of said receive element, said queue classifier, said 
storage element, said transmit element and said selectable 
multiplexer are implemented as separate circuits. 

12. The apparatus according to claim 1, wherein each of 
said transmit and receive circuits comprise a multi-queue 
storage element. 

13. An apparatus comprising: 

a plurality of interface means each comprising (i) transmit 
means comprising a plurality of transmit outputs, (ii) 
receive means comprising a plurality of receive inputs 
and (iii) scheduler means configured to control said 
transmit means and said receive means, wherein each 
one of said plurality of transmit outputs of one of said 
plurality of interface means is connected to one of said 
plurality of receive inputs of another of said plurality of 
interface means. 

14. A method for providing a switch fabric, comprising 
the steps of: 

(A) providing a plurality of interface circuits each com- 
prising (i) a transmit circuit having a plurality of 
transmit outputs, (ii) a receive circuit having a plurality 
of receive inputs and (iii) a scheduler circuit configured 
to control said transmit and receive circuits; and 

(B) connecting each one of said plurality of transmit 
outputs of one of said plurality of interface circuits to 
one of said plurality of receive inputs of another of said 
plurality of interface circuits. 

15. The method according to claim 14, wherein said 
transmit outputs and said receive inputs are connected 
through a plurality of serial links. 

16. The method according to claim 14, wherein step (A) 
further comprises: 

providing a first receive element; 
providing a first queue classifier; 
providing a first storage element; 
providing a first transmit element; and 
providing a first selectable multiplexer. 

17. The method according to claim 16, wherein two or 
more of said first receive element, said first queue classifier, 
said first storage element, said first transmit element and said 
first selectable multiplexer are integrated as a single inte- 
grated circuit. 

18. The method according to claim 16, wherein step (A) 
further comprises: 

providing a second receive element; 
providing a second queue classifier; 
providing a second storage element; 
providing a second transmit element; and 
providing a second selectable multiplexer. 

19. The method according to claim 18, wherein two or 
more of said second receive element, said second queue 
classifier, said second storage element, said second transmit 
element and said second selectable multiplexer are inte- 
grated as a single integrated circuit. 

20. The method according to claim 14, wherein said 
scheduler circuit is configured to control priority and port 
direction of said transmit and receive circuits. 